{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "62fc336ca6e8ff3e",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "# Integrating with MLflow"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b69fe4a00cb5fd9d",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "MLflow is an open-source platform designed for managing the end-to-end machine learning lifecycle. It provides functionalities for tracking experiments, packaging code into reproducible runs, and sharing and deploying models. \n",
    "\n",
    "NeuralProphet is compatible with MLflow and we can track our jobs on the MLflow platform."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "68a3a88dce782e94",
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# for this tutorial, we need to install MLflow.\n",
    "# !pip install mlflow\n",
    "\n",
    "# Start a MLflow tracking-server on your local machine\n",
    "# !mlflow server --host 127.0.0.1 --port 8080\n",
    "\n",
    "if \"google.colab\" in str(get_ipython()):\n",
    "    # uninstall preinstalled packages from Colab to avoid conflicts\n",
    "    !pip uninstall -y torch notebook notebook_shim tensorflow tensorflow-datasets prophet torchaudio torchdata torchtext torchvision\n",
    "    !pip install git+https://github.com/ourownstory/neural_prophet.git  # may take a while\n",
    "\n",
    "# much faster using the following code, but may not have the latest upgrades/bugfixes\n",
    "# pip install neuralprophet"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "325dbd0616952d9f",
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from neuralprophet import NeuralProphet, set_log_level, save\n",
    "import mlflow\n",
    "import time\n",
    "\n",
    "set_log_level(\"ERROR\")\n",
    "\n",
    "data_location = \"https://raw.githubusercontent.com/ourownstory/neuralprophet-data/main/datasets/\"\n",
    "df = pd.read_csv(data_location + \"air_passengers.csv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "44e39e454140e92",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Setting Up the MLflow Tracking Server"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9c999a9f7b26842a",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "In this step, we're configuring MLflow to use a tracking server for logging and monitoring our machine learning experiments. The tracking server acts as a central repository for MLflow to store experiment data. This includes information like model parameters, metrics, and output files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "e7d835e606e97447",
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Set variable 'local' to True if you want to run this notebook locally\n",
    "local = False"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "a80a4c8d1627e83f",
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Set our tracking server uri for logging\n",
    "mlflow.set_tracking_uri(uri=\"http://127.0.0.1:8080\") if local else None"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1cadfc2029a9f855",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## End Previous Run"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ec9be45ccf0b27b8",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "If you have an active run before you start logging and monitoring this will throw an error. Therefore make sure you end all previous active runs. In a normal setting you should not have any active runs and you can ignore the following cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "24b23a7d5b203363",
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# End previous run if any\n",
    "# mlflow.end_run()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e7699b83324ea1f6",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Starting an MLflow Experiment with NeuralProphet"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "63ebf38330c44d91",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "In the next step, we'll delve into initiating and managing an MLflow experiment for training a NeuralProphet model. The focus will be on setting up the experiment, defining model hyperparameters, and logging essential training metrics."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "ce5870177ff63aad",
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Start a new MLflow run\n",
    "if local:\n",
    "    with mlflow.start_run():\n",
    "        # Create a new MLflow experiment\n",
    "        mlflow.set_experiment(\"NP-MLflow Quickstart_v1\")\n",
    "\n",
    "        # Set a tag for the experiment\n",
    "        mlflow.set_tag(\"Description\", \"NeuralProphet MLflow Quickstart\")\n",
    "\n",
    "        # Define NeuralProphet hyperparameters\n",
    "        params = {\n",
    "            \"n_lags\": 5,\n",
    "            \"n_forecasts\": 3,\n",
    "        }\n",
    "\n",
    "        # Log Hyperparameters\n",
    "        mlflow.log_params(params)\n",
    "\n",
    "        # Initialize NeuralProphet model and fit\n",
    "        start = time.time()\n",
    "        m = NeuralProphet(**params)\n",
    "        metrics_train = m.fit(df=df, freq=\"MS\")\n",
    "        end = time.time()\n",
    "\n",
    "        # Log training duration\n",
    "        mlflow.log_metric(\"duration\", end - start)\n",
    "\n",
    "        # Log training metrics\n",
    "        mlflow.log_metric(\"MAE_train\", value=list(metrics_train[\"MAE\"])[-1])\n",
    "        mlflow.log_metric(\"RMSE_train\", value=list(metrics_train[\"RMSE\"])[-1])\n",
    "        mlflow.log_metric(\"Loss_train\", value=list(metrics_train[\"Loss\"])[-1])\n",
    "\n",
    "        # save model\n",
    "        model_path = \"np-model.np\"\n",
    "        save(m, model_path)\n",
    "\n",
    "        # Log the model in MLflow\n",
    "        mlflow.log_artifact(model_path, \"np-model\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cbf9fff72904c5e9",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## View the NeuralProphet Run in the MLflow UI"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1c67c63d0f0aaff4",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "In order to see the results of our run, we can navigate to the MLflow UI. Since we have already started the Tracking Server at http://localhost:8080, we can simply navigate to that URL in our browser and observe our experiments. \n",
    "If we click on the respective experiments we can see a list of all runs associated with the experiment. Clicking on the run will take us to the run page, where the details of what we’ve logged will be shown."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7fb27b941602401d91542211134fc71a",
   "metadata": {},
   "source": [
    "## Advanced Example \n",
    "with: MLflow Autologging, dataset, requirements & environment metadata with model signature inference, another MLflow Experiment with NeuralProphet"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "acae54e37e7d407bbb7b55eff062a284",
   "metadata": {},
   "outputs": [],
   "source": [
    "# MLflow setup\n",
    "# Run this command with environment activated: mlflow ui --port xxxx (e.g. 5000, 5001, 5002)\n",
    "# Copy and paste url from command line to web browser\n",
    "\n",
    "import mlflow\n",
    "from mlflow.data.pandas_dataset import PandasDataset\n",
    "\n",
    "if local:\n",
    "    mlflow.pytorch.autolog(\n",
    "        log_every_n_epoch=1,\n",
    "        log_every_n_step=None,\n",
    "        log_models=True,\n",
    "        log_datasets=True,\n",
    "        disable=False,\n",
    "        exclusive=False,\n",
    "        disable_for_unsupported_versions=False,\n",
    "        silent=False,\n",
    "        registered_model_name=None,\n",
    "        extra_tags=None,\n",
    "    )\n",
    "\n",
    "    import mlflow.pytorch\n",
    "\n",
    "    model_name = \"NeuralProphet\"\n",
    "\n",
    "    with mlflow.start_run() as run:\n",
    "        dataset: PandasDataset = mlflow.data.from_pandas(df, source=\"AirPassengersDataset\")\n",
    "\n",
    "        # Log the dataset to the MLflow Run. Specify the \"training\" context to indicate that the\n",
    "        # dataset is used for model training\n",
    "        mlflow.log_input(dataset, context=\"training\")\n",
    "\n",
    "        mlflow.log_param(\"model_type\", \"NeuralProphet\")\n",
    "        mlflow.log_param(\"n_lags\", 8)\n",
    "        mlflow.log_param(\"ar_layers\", [8, 8, 8, 8])\n",
    "        mlflow.log_param(\"accelerator\", \"gpu\")\n",
    "\n",
    "        # To Train\n",
    "        # Import the NeuralProphet class\n",
    "        from neuralprophet import NeuralProphet, set_log_level\n",
    "\n",
    "        # Disable logging messages unless there is an error\n",
    "        set_log_level(\"ERROR\")\n",
    "\n",
    "        # Create a NeuralProphet model with default parameters\n",
    "        m = NeuralProphet(n_lags=8, ar_layers=[8, 8, 8, 8], trainer_config={\"accelerator\": \"gpu\"})\n",
    "\n",
    "        # Use static plotly in notebooks\n",
    "        m.set_plotting_backend(\"plotly-resampler\")\n",
    "\n",
    "        # Fit the model on the dataset\n",
    "        metrics = m.fit(df)\n",
    "\n",
    "        df_future = m.make_future_dataframe(df, n_historic_predictions=48, periods=12)\n",
    "\n",
    "        # Predict the future\n",
    "        forecast = m.predict(df_future)\n",
    "\n",
    "    mlflow.metrics.mae()\n",
    "\n",
    "    # Save conda environment used to run the model\n",
    "    mlflow.pytorch.get_default_conda_env()\n",
    "\n",
    "    # Save pip requirements\n",
    "    mlflow.pytorch.get_default_pip_requirements()\n",
    "\n",
    "    # Registering model\n",
    "    model_uri = f\"runs:/{run.info.run_id}/NeuralProphet_test\"\n",
    "    mlflow.register_model(model_uri=model_uri, name=model_name)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.0rc1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
